Whole genome sequencing reveals new links between spa t172/CC59 methicillin-resistant Staphylococcus aureus cases in low-endemicity region of Southwest Finland, 2007‒2016

Methicillin-resistant Staphylococcus aureus (MRSA) rates have remained relatively low in Finland. In Southwest Finland, however, annual MRSA incidence increased from 12 to 25/100,000 between 2007 and 2016, with the spa t172 strain causing one fourth (237/983) of all cases. This prompted us to study the molecular epidemiology of t172-MRSA, aiming to better understand the transmission of this strain type. We combined epidemiological data and whole genome sequencing (WGS) of a set of 64 (27%, 64/237) t172-MRSA isolates covering 10 years. Isolates represented sporadic cases and index cases of all identified healthcare-associated outbreaks (HAOs) and family clusters (FCs). Among the included 62 isolates, core-genome MLST analysis revealed eight genomic clusters comprising 24 (38.7%) isolates and 38 (61.3%) non-clustered isolates. Cluster 1 comprised ten isolates and the remaining seven clusters two isolates each. Two epidemiologically distinct HAOs were linked in cluster 1. FCs were involved in all clusters. All strains were associated with epidemic clonal complex CC59. We were able to confirm the spread of several successful t172-MRSA subclones in regional healthcare and the community. WGS complemented routine surveillance by revealing undetected links between t172-MRSA cases. Targeted, WGS-based typing could enhance MRSA surveillance without the need for routine WGS diagnostics.

Demographics and epidemiological classification of t172 cases. Based on the available epidemiological information, t172 cases were distributed between 170 (71.7%) healthcare-associated (HA-MRSA) and 67 (28.3%) community-associated (CA-MRSA) cases (Fig. 2). Within the HA-MRSA group, 74 cases were associated with healthcare-associated outbreaks (HAOs), 51 cases with family clusters (FCs), and 45 were defined as other HA cases. Within the CA-MRSA group, 50 cases were associated with FCs and 17 cases were defined as sporadic. All t172 index cases of HAOs (n = 9) and FCs (n = 38) and all sporadic t172 cases (n = 17) were selected for the WGS analysis. Among all cases, a total of 26 HAOs and 129 FCs were identified, and t172 cases were involved in 12 (46.2%) and 39 (30.2%) of these, respectively. In three HAOs and one FC where t172 cases were involved, the index case was of a spa type other than t172. These four clusters were not included. The selected cases were spread throughout the study period (Fig. 2). Additionally, 16 t172 isolates from cases of HAO investigations were used for comparison in the cgMLST cluster analysis.
The median age of the t172 cases was 58 years (range 0.1-100.9 years), which is significantly higher than that of other spa types combined (median 40 years, range 0.1-103.4 years, p < 0.001). The t172 cases were not associated unambiguously with either the HA- or the CA-MRSA group (Table 1). Within the HA-MRSA group, t172 cases were associated with long-term care and HAOs, whereas the association with hospital care abroad was negative. In the CA-MRSA group, t172 cases were negatively associated with sporadicity. Additionally, t172 cases were associated with FCs, but the association with immigrant status, livestock contact and travel or working abroad was negative. No differences between t172 and other spa type cases were observed in terms of sex, history of intravenous drug use, HCW status or sample type.
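The case counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis; the variable names and the helper `pct` are illustrative assumptions, while all numbers are taken from the text.

```python
# Case counts as reported in the text (t172 MRSA, 2007-2016).
ha = {"HAO": 74, "FC": 51, "other_HA": 45}   # healthcare-associated cases
ca = {"FC": 50, "sporadic": 17}              # community-associated cases

ha_total = sum(ha.values())
ca_total = sum(ca.values())
t172_total = ha_total + ca_total


def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


# Index-case selection: clusters whose index case was not t172 are dropped.
hao_index = 12 - 3   # t172-involved HAOs minus those with a non-t172 index case
fc_index = 39 - 1    # t172-involved FCs minus the one with a non-t172 index case
selected = hao_index + fc_index + ca["sporadic"]

print(ha_total, ca_total, t172_total)   # HA, CA and all t172 cases
print(pct(ha_total, t172_total), pct(ca_total, t172_total))
print(hao_index, fc_index, selected)
print(pct(12, 26), pct(39, 129))        # share of HAOs and FCs involving t172
```

The totals reproduce the 237 t172 cases, the 71.7%/28.3% split and the 64 isolates selected for WGS.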

WGS based analysis.
In total, 64 t172 MRSA isolates representing the index cases of the HAOs (n = 9) or FCs (n = 38) and all sporadic cases (n = 17) were selected for WGS. After excluding two samples (CA-MRSA, sporadic case, year 2016: isolate not available, and HA-MRSA, FC index case, year 2013: low quality in WGS), altogether 62 samples were included in the WGS based analyses. An average of 125-fold coverage (range 75-183) and N50 of 79,280 were observed. Sixty isolates were ST375 and two were single locus variants (SLVs) of ST375 (ST59 and ST5428, Table S1). Predicted SCCmec elements included indication of SCCmec type IV(2B) in 60 isolates and type V(5C2&5) in two (SWF_2 and SWF_13) isolates. No discrepancy between WGS based and conventional spa typing was detected. Three technical control duplicate sequences from THL confirmed identical typing results in core-genome multi-locus sequence typing (cgMLST) between the laboratories.
The ten isolates belonging to the largest cluster (cluster 1) represented variable epidemiological backgrounds. The earliest case linked to cluster 1 was a sporadic CA-MRSA case isolated in 2009. Three other sporadic CA-MRSA cases in the cluster were isolated in 2010 and 2012. The six other cases in cluster 1 were HA-MRSA. Of these, two were HAO index cases. Interestingly, these two separately perceived HAOs were revealed to share the same genomic background. These HAOs took place in 2009-2012 (index SWF_16) and 2013-2016 (index SWF_23), respectively. The other four HA-MRSA cases in cluster 1 were FC index cases isolated in 2010 and 2011. Based on the epidemiological data, 3/4 of these FC index cases were previously linked to the earlier HAO (index SWF_16) in cluster 1 and thus WGS supported these linkages (Table S1). Clusters 2-8 comprised two isolates each. All these clusters involved an FC index case and revealed previously undetected links between FCs, HAOs and sporadic cases (Fig. 3a). In clusters 2 and 4, an FC index case and a sporadic case were linked. In clusters 3, 6 and 7, two separate FC index cases were linked. In clusters 5 and 8, an FC index case and an HAO index case were linked. In cluster 5 there was also a link between the cases based on the epidemiological data and thus WGS supported the link. There were no other previously identified epidemiological links in clusters 2-8. In all clusters with an HAO index case involved (clusters 1, 5 and 8), isolation dates suggest a transmission direction from healthcare to households (Fig. 3b).
The non-clustered isolates (n = 38, 61.3%) were distributed in the population with  SNVs distance to their genetically closest isolate (Fig. 3). Three additional and two extended clusters would have been formed if we had used a slightly wider clustering cutoff (16 to 19 SNVs, Fig. 3). Otherwise, the dispersion of small clusters in the population was clear (25 or more SNVs to the closest isolate). Only two isolates (HAO index SWF_2 and sporadic SWF_27) differed by over 100 SNVs from their closest isolate. The non-clustered isolates represented 10 sporadic, 23 FC and 5 HAO index cases. These isolates were reported throughout the whole study period. Based on the epidemiological data, one FC index case (SWF_52) was linked to an HAO (index SWF_9), but interestingly, based on the WGS analysis, they did not share a genomic background (pairwise distance 72 SNVs), refuting probable transmission. No other apparent epidemiological links were refuted in the WGS analysis (Table S1).
Additional isolates. In this study, additional WGS sequence data from 16 t172 MRSA isolates from three separate t172 HAO investigations (index cases SWF_2, SWF_13 and SWF_23) were included in the cgMLST analysis as technical controls (n = 3) and for comparison (n = 13) with isolates known to be clustered. Of these 13 additional isolates from two different HAO investigations, three isolates were incorporated into one of the previously mentioned clusters (cluster 1) and 10 isolates formed one additional cluster (cluster 9, Fig. 3). These additional isolates demonstrate how conventional outbreak-based investigations relate to our selection of the local t172 MRSA strains. As seen in cluster 1, not all cases associated with the outbreak were previously detected, as they had been isolated up to 7 years before the HAO investigation in 2016 (Fig. 3b). However, cluster 9 included only cases epidemiologically recognized in the HAO investigation in 2015.

Discussion
This study characterized the demographics and molecular epidemiology of t172 MRSA cases in the HD of Southwest Finland during the years 2007-2016. Demographic results show that the t172 cases represented diverse epidemiological groups, while a strong indication of primarily domestic transmission was confirmed. Combining epidemiology and a targeted WGS analysis, we were able to assess the clonality of the strains and detect new transmission clusters. Seven new transmission clusters were detected, one was confirmed and one refuted using WGS, and a diverse t172-MRSA background population was revealed.
In this study, the demographic features of t172 MRSA reflect the domestic nature of transmission, which is also supported by the low reporting frequency of t172 elsewhere in the world (Table 1) 7,9 . The increase of spa type diversity and the decreasing proportion of t172 in the HD of Southwest Finland were detected simultaneously with the increase in the incidence of cases with immigrant status and hospital care abroad as well as a shift towards younger age groups 5 . In particular, the increasing prevalence of spa t304 strains in many European countries has been discussed in the context of refugees and immigration from the Middle East (Iraq, Syria) during the last decade 15,16 . However, t172 has been continuously reported nationally and had been only partially replaced in prevalence by other spa types by the end of 2021 (4th most prevalent spa type, 5% of all Finnish MRSA cases) 4 . Our results also show a higher median age and a stronger association with both local household and healthcare outbreaks among the t172 cases (Table 1). Therefore, the observed differences in the epidemiology of MRSA strains argue against the assumption that foreign influx and direct replacement of strain types via competition are solely behind the gradual local decrease in t172 cases. Other explanations possibly contributing to the decline of t172 are changing environmental factors such as infection control practices and changes in the screening schema 17 .
Although t172 strains are infrequently reported outside Finland, related strains are common elsewhere. The single locus variant strain types of ST59, such as ST375 associated with t172, belong to clonal complex CC59. It is a globally successful CA-MRSA lineage with major East-Asian and North-American subclades, which are both reported in Europe 18,19 . In China, CC59 has become one of the dominant lineages in many hospitals 20 . Additionally, CC59 strains have been reported in Western Australia and Japan and are among the most prevalent strains in Sweden, Norway and Iceland 7,21,22 . CC59 strains appear to be genetically distinct and not limited to specific regions or host environments, and generally possess the qualities of high-risk CA-MRSA clones 23,24 . The clonal nature and success of the t172/CC59 strain is reflected in our results in its tendency to form distinct and persistent genetic clusters spanning community and hospital environments. It is also reflected in the shared CC59 genotype and associated SCCmec elements 18,20 . In the future, a phylogenomic analysis using genomic sequences from a representative set of international CC59 MRSA isolates would help to understand the route of dissemination and clonal origin of t172/CC59 more deeply.
Previously, t172 strains have been isolated from asymptomatic carriers as well as from severe MRSA/MSSA infections 13 . The majority of t172 isolates in our study were from screening samples, which reflects the role of asymptomatic carriers in CA-MRSA transmission as well as the MRSA screening policy in the HD 5 . [...] infection samples and the detailed information about the culture/specimen type was defective. Thus, only five of the above-mentioned blood culture isolates were identified and, interestingly, one strain from these was t172.
A major proportion of all t172 cases (42.6%) in this study were linked with FCs. In the WGS analysis, all the identified clusters included an FC index case. The role of households in CA-MRSA transmission is acknowledged and household transmission has been recognized as a major source of new MRSA acquisitions in low-prevalence settings 26-28 . Our findings reflect the potential of t172 MRSA strains to persist and spread between different environments, a commonly reported observation in epidemiological studies involving CA-MRSA 29 . Although the dichotomous HA-/CA-MRSA classification has been questioned, HA-MRSA strains seem to spread to a more limited extent outside healthcare settings, while CA-MRSA clones are commonly reported in healthcare 29-32 . In addition to detecting transmission clusters, the WGS analysis both confirmed (cluster 9) and refuted (SWF_9, SWF_52) epidemiological links between t172 cases. Importantly, in cluster 1 two epidemiologically distinct HAOs were linked via FC index cases, and seven out of the eight clusters we identified would not have been noticed at all as part of conventional surveillance due to the temporal distance and lack of an epidemiological link between the cases. The resolution of spa typing combined with extensive epidemiological tracing was thus insufficient to distinguish certain outbreaks. Therefore, targeted sequencing of isolates from a limited number of epidemiological groups of known risk (e.g. secondary cases of FCs linked by WGS) can be a potential strategy when trying to limit the spread of certain MRSA strains, such as t172, without the need for routine WGS diagnostics 33 .
WGS has been used widely to confirm, detect or refute hospital outbreaks and previously undetected transmission of MRSA 34 . Although the cost and lack of global nomenclature hinder its adoption as routine diagnostics, software development has extended the use of WGS 35 . Several allelic and SNV cutoffs for clustering closely related MRSA strains have been discussed in the literature depending on the method. Strains with strong epidemiological links tend to differ from each other by under 20 core-genome SNVs 28,33,36-38 . Although our approach was sample-centric and based on a previously defined core genome, we chose to use a conservative 15 core-genome SNV cutoff proposed recently by Coll et al. 39 . They analyzed over 1000 MRSA cases and concluded that 15 core-genome SNVs was the best genetic cutoff to rule out transmission events up to 6 months. This means that some putative transmission events (16 to 20 SNVs) in our data may have been overlooked, as seen between some isolates and two WGS clusters (clusters 6 and 7, Fig. 3). Nevertheless, the use of a stringent SNV cutoff increases the credibility of the identified clusters. The isolation dates of clustered isolates ranged from less than a year up to four years apart, which suggests a local long-term reservoir of clonal t172 strains (Fig. 3b). Overall, it can be concluded that the genetic distance of the isolates in this small selection reflects the circulating population of t172 MRSA in the HD, where some subclones have been successful in causing outbreaks despite active preventive measures 20,40-42 .
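The effect of the clustering cutoff discussed above can be illustrated with a short sketch. It assumes single-linkage grouping (an isolate joins a cluster if it is within the cutoff of any member), which is an assumption made here for illustration and not a confirmed description of the study's software. The isolate names and pairwise distances are invented toy values, not study data.

```python
from itertools import combinations


def single_linkage_clusters(names, dist, cutoff):
    """Group isolates whose pairwise SNV distance is <= cutoff (single linkage).

    `dist` maps frozenset({a, b}) to an SNV distance. Returns clusters with at
    least two members; isolates without a close neighbour are left out.
    """
    parent = {n: n for n in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy data (hypothetical): every pair is 60 SNVs apart unless overridden.
names = ["iso1", "iso2", "iso3", "iso4", "iso5"]
dist = {frozenset(p): 60 for p in combinations(names, 2)}
dist[frozenset(("iso1", "iso2"))] = 4
dist[frozenset(("iso2", "iso3"))] = 12
dist[frozenset(("iso1", "iso3"))] = 17   # linked only through iso2
dist[frozenset(("iso4", "iso5"))] = 18   # just above the 15-SNV cutoff

strict = single_linkage_clusters(names, dist, cutoff=15)
wider = single_linkage_clusters(names, dist, cutoff=19)
print(strict)
print(wider)
```

With the 15-SNV cutoff only the first three toy isolates cluster; widening the cutoff to 19 SNVs adds a second cluster, mirroring how pairs in the 16 to 20 SNV range are left unclustered by the stricter threshold.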
There are some limitations in our study. The number of isolates included in the WGS analysis limits the power of the study. Carefully defined criteria, however, were used when selecting the isolates to ensure epidemiological diversity during the 10-year study period, and a conservative SNV cutoff was used to avoid overestimation of identity. The availability of extensive epidemiological register data is a strength of this study.
Based on our results, we conclude that t172 MRSA strains can spread between different environments in the community as well as in hospitals and cause persistent hospital outbreaks in a setting with generally low endemicity of MRSA. The t172 strain spread among older patients and was not introduced by patients with foreign healthcare contacts or origin. Endemic transmission of high-risk CA-MRSA strains can be challenging to prevent, which is reflected in the major role of FCs and asymptomatic carriers. Despite the cost-effective detection of diversity among MRSA strains with spa typing, the resolution might be insufficient when looking at transmission links on the individual level. The potential of WGS in outbreak investigations is well established; however, an integrated approach could provide a useful and cost-effective way of tracking high-risk MRSA transmission 33 . Our results encourage the development of targeted WGS-based MRSA surveillance to accompany screening and risk factor identification to improve detection and prevention of persistent MRSA clone circulation.

Methods
Study material and definitions. The spa t172 MRSA cases and the respective bacterial isolates were identified through our previous retrospective, population-based study covering all 983 new MRSA cases in the HD of Southwest Finland (catchment population of 478 500, representing 8.7% of the Finnish population in 2016) between 2007 and 2016. Seven cases lacking data on the spa type were excluded (n = 976) 5 . No informed consent was asked from the subjects due to the retrospective, register-based nature of the study. All experiments and analyses were performed in accordance with relevant guidelines and regulations.
For each case, the following demographic and background data on possible risk factors were available: age, sex, specimen type (screen/clinical MRSA infection), immigrant status, livestock contact, intravenous drug use, long-term care residence, healthcare worker (HCW) status, travel or work abroad within the previous year, and hospital care abroad within the previous 2 years. All cases were defined as either healthcare-associated (HA-MRSA) or community-associated (CA-MRSA). Cases in HCWs, infants under 28 days of age and patients who had been hospitalized (including long-term facilities) within the last two years in Finland or the previous year abroad were defined as HA-MRSA. Other cases were defined as CA-MRSA. HAOs, including hospital and long-term facility outbreaks, were defined as at least two cases in HCWs or patients sharing a room/washing facility in the same unit. FCs were defined as two or more cases living in the same household. Index cases were defined as the first identified case leading to an investigation of the respective HAO or FC. Additionally, a case was defined as sporadic if it lacked an epidemiological link to healthcare or FCs. HA-MRSA cases without a link to an HAO or FC were defined as other HA cases. This detailed epidemiological information was acquired via systematic epidemiological tracking of cases, performed by the local infection control unit.
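The case definitions above amount to a small decision rule, sketched below. The `Case` fields and the `classify` function are hypothetical names introduced for illustration. The text does not state which label takes precedence when an HA-MRSA case is linked to both an HAO and an FC; the sketch assumes the HAO label wins.

```python
from dataclasses import dataclass


@dataclass
class Case:
    # All field names are illustrative assumptions, not the study's data schema.
    is_hcw: bool = False
    age_days: int = 365 * 30
    hospitalized_finland_2y: bool = False   # incl. long-term facilities
    hospitalized_abroad_1y: bool = False
    in_hao: bool = False                    # part of a healthcare-associated outbreak
    in_fc: bool = False                     # part of a family cluster


def classify(case):
    """Return (main group, subgroup) following the definitions in the text."""
    healthcare_link = (
        case.is_hcw
        or case.age_days < 28
        or case.hospitalized_finland_2y
        or case.hospitalized_abroad_1y
    )
    if healthcare_link:
        if case.in_hao:          # assumed precedence: HAO before FC
            sub = "HAO"
        elif case.in_fc:
            sub = "FC"
        else:
            sub = "other HA"
        return ("HA-MRSA", sub)
    # No healthcare link: community-associated, sporadic unless in a family cluster.
    return ("CA-MRSA", "FC" if case.in_fc else "sporadic")


print(classify(Case(hospitalized_finland_2y=True, in_fc=True)))
print(classify(Case(age_days=10)))
print(classify(Case()))
```

Applied to the study data, these five subgroups correspond to the HAO, FC and other HA cases within HA-MRSA and the FC and sporadic cases within CA-MRSA.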
Currently, all Finnish MRSA isolates are spa typed by The Finnish Institute for Health and Welfare (THL) while HAOs are investigated with WGS upon request 7 . THL reports the typing results to the referring clinical